
    Deep supervised learning using local errors

    Error backpropagation is a highly effective mechanism for learning high-quality hierarchical features in deep networks. Updating the features or weights in one layer, however, requires waiting for the propagation of error signals from higher layers. Learning using delayed and non-local errors makes it hard to reconcile backpropagation with the learning mechanisms observed in biological neural networks, as it requires neurons to maintain a memory of the input until the higher-layer errors arrive. In this paper, we propose an alternative learning mechanism in which errors are generated locally in each layer using fixed, random auxiliary classifiers. Lower layers can thus be trained independently of higher layers, and training can proceed either layer by layer or simultaneously in all layers using local error information. We address biological-plausibility concerns such as weight-symmetry requirements and show that the proposed learning mechanism, based on fixed, broad, and random tuning of each neuron to the classification categories, outperforms the biologically motivated feedback-alignment technique on the MNIST, CIFAR10, and SVHN datasets, approaching the performance of standard backpropagation. Our approach highlights a potential biological mechanism for the supervised, or task-dependent, learning of feature hierarchies. In addition, we show that it is well suited for learning deep networks in custom hardware, where it can drastically reduce memory traffic and data communication overheads.
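
    As a minimal sketch of the mechanism described above, the PyTorch snippet below attaches a fixed, random auxiliary classifier to each hidden layer and trains every layer only from its own local cross-entropy error; the layer sizes, optimizer, and stand-in batch are illustrative assumptions, not the authors' configuration.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class LocallyTrainedLayer(nn.Module):
            # Hidden layer paired with a fixed, random auxiliary classifier.
            def __init__(self, in_dim, out_dim, num_classes):
                super().__init__()
                self.fc = nn.Linear(in_dim, out_dim)
                self.aux = nn.Linear(out_dim, num_classes, bias=False)
                self.aux.weight.requires_grad_(False)  # random classifier is never updated

            def forward(self, x, target=None):
                h = F.relu(self.fc(x))
                # Local error: cross-entropy through the frozen random classifier.
                loss = F.cross_entropy(self.aux(h), target) if target is not None else None
                # Detach so no gradient flows back into lower layers.
                return h.detach(), loss

        layers = [LocallyTrainedLayer(784, 512, 10), LocallyTrainedLayer(512, 512, 10)]
        readout = nn.Linear(512, 10)
        trainable = [p for l in layers for p in l.parameters() if p.requires_grad]
        opt = torch.optim.Adam(trainable + list(readout.parameters()), lr=1e-3)

        x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))  # stand-in batch
        h, losses = x, []
        for layer in layers:                    # each layer learns from its own local loss
            h, loss = layer(h, y)
            losses.append(loss)
        losses.append(F.cross_entropy(readout(h), y))
        opt.zero_grad()
        sum(losses).backward()                  # gradients stay within each layer
        opt.step()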

    A Charge-Based CMOS Parallel Analog Vector Quantizer

    We present an analog VLSI chip for parallel analog vector quantization. The MOSIS 2.0 μm double-poly CMOS Tiny chip contains an array of 16 × 16 charge-based distance-estimation cells, implementing a mean absolute difference (MAD) metric operating on a 16-input analog vector field and 16 analog template vectors. The distance cell, including dynamic template storage, measures 60 × 78 μm². Additionally, the chip features a winner-take-all (WTA) output circuit of linear complexity, with global positive feedback for fast and decisive settling of a single winner output. Experimental results on the complete 16 × 16 VQ system demonstrate correct operation with a 34 dB analog input dynamic range and a 3 μs cycle time at 0.7 mW power dissipation.
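
    Functionally, the chip evaluates a mean-absolute-difference distance from one 16-component input vector to each of 16 stored templates and then picks a single winner. A short NumPy sketch of that behaviour (a model of the function only, not of the charge-based analog circuitry), using random stand-in data:

        import numpy as np

        rng = np.random.default_rng(0)
        templates = rng.uniform(-1.0, 1.0, size=(16, 16))  # 16 analog template vectors
        x = rng.uniform(-1.0, 1.0, size=16)                # 16-component analog input

        mad = np.mean(np.abs(templates - x), axis=1)       # MAD distance to each template
        winner = int(np.argmin(mad))                       # WTA: single winning template
        print(winner, mad[winner])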

    Hardware-efficient on-line learning through pipelined truncated-error backpropagation in binary-state networks

    Artificial neural networks (ANNs) trained using backpropagation are powerful learning architectures that have achieved state-of-the-art performance in various benchmarks. Significant effort has been devoted to developing custom silicon devices to accelerate inference in ANNs. Accelerating the training phase, however, has attracted relatively little attention. In this paper, we describe a hardware-efficient on-line learning technique for feedforward multi-layer ANNs that is based on pipelined backpropagation. Learning is performed in parallel with inference in the forward pass, removing the need for an explicit backward pass and requiring no extra weight lookup. By using binary state variables in the feedforward network and ternary errors in truncated-error backpropagation, the need for any multiplications in the forward and backward passes is removed, and memory requirements for the pipelining are drastically reduced. Further reduction in addition operations, owing to the sparsity of the forward neural and backpropagating error signal paths, contributes to a highly efficient hardware implementation. As a proof-of-concept validation, we demonstrate on-line learning of MNIST handwritten-digit classification on a Spartan-6 FPGA interfacing with an external 1 Gb DDR2 DRAM, showing only a small degradation in test error compared to an equivalently sized binary ANN trained off-line using standard backpropagation and exact errors. Our results highlight an attractive synergy between pipelined backpropagation and binary-state networks in substantially reducing computation and memory requirements, making pipelined on-line learning practical in deep networks.
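
    The quantization idea can be illustrated roughly as follows: 0/1 binary activations in the forward path and a ternary {-1, 0, +1} truncated error in the backward path, so that the weight-gradient accumulation needs only signed additions of binary inputs. The threshold, layer sizes, and the stand-in error signal below are assumptions for illustration, not the paper's settings.

        import numpy as np

        rng = np.random.default_rng(1)

        def binarize(a):
            return (a > 0).astype(np.float32)        # 0/1 binary states

        def ternarize(err, thresh=0.05):
            q = np.zeros_like(err)
            q[err > thresh] = 1.0                    # truncate small errors to zero,
            q[err < -thresh] = -1.0                  # keep only the sign otherwise
            return q

        W = rng.normal(0.0, 0.1, size=(784, 256))
        x = binarize(rng.normal(size=(32, 784)))     # binary inputs
        h = binarize(x @ W)                          # binary hidden states
        err_h = ternarize(rng.normal(0.0, 0.1, size=h.shape))  # stand-in backprop error

        # Gradient accumulation x.T @ err_h reduces to signed additions of binary inputs.
        lr = 1e-3
        W -= lr * (x.T @ err_h)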

    Event-Driven Contrastive Divergence for Spiking Neuromorphic Systems

    Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been demonstrated to perform efficiently in a variety of applications, such as dimensionality reduction, feature learning, and classification. Their implementation on neuromorphic hardware platforms emulating large-scale networks of spiking neurons can have significant advantages from the perspectives of scalability, power dissipation, and real-time interfacing with the environment. However, the traditional RBM architecture and the commonly used training algorithm, known as Contrastive Divergence (CD), are based on discrete updates and exact arithmetic, which do not directly map onto a dynamical neural substrate. Here, we present an event-driven variation of CD to train an RBM constructed with Integrate & Fire (I&F) neurons, constrained by the limitations of existing and near-future neuromorphic hardware platforms. Our strategy is based on neural sampling, which allows us to synthesize a spiking neural network that samples from a target Boltzmann distribution. The recurrent activity of the network replaces the discrete steps of the CD algorithm, while Spike-Timing-Dependent Plasticity (STDP) carries out the weight updates in an online, asynchronous fashion. We demonstrate our approach by training an RBM composed of leaky I&F neurons with STDP synapses to learn a generative model of the MNIST handwritten-digit dataset, and by testing it in recognition, generation, and cue-integration tasks. Our results contribute to a machine-learning-driven approach for synthesizing networks of spiking neurons capable of carrying out practical, high-level functionality.
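
    For orientation, the sketch below shows a conventional CD-1 update for a Bernoulli RBM, i.e., the discrete algorithm whose role the recurrent spiking dynamics and STDP take over in the event-driven variant; biases are omitted, and the sizes and learning rate are illustrative.

        import numpy as np

        rng = np.random.default_rng(2)
        n_vis, n_hid, lr = 784, 128, 0.01
        W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))

        def sigmoid(a):
            return 1.0 / (1.0 + np.exp(-a))

        def sample(p):
            return (rng.random(p.shape) < p).astype(np.float32)

        v0 = (rng.random((32, n_vis)) < 0.1).astype(np.float32)  # stand-in data batch
        ph0 = sigmoid(v0 @ W)                  # hidden probabilities given data
        h0 = sample(ph0)
        v1 = sample(sigmoid(h0 @ W.T))         # one step of Gibbs sampling
        ph1 = sigmoid(v1 @ W)

        # Contrastive-divergence update: data correlations minus reconstruction correlations.
        W += lr * (v0.T @ ph0 - v1.T @ ph1) / v0.shape[0]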

    Learning Non-deterministic Representations with Energy-based Ensembles

    The goal of a generative model is to capture the distribution underlying the data, typically through latent variables. After training, these variables are often used as a new representation, more effective than the original features in a variety of learning tasks. However, the representations constructed by contemporary generative models are usually point-wise deterministic mappings from the original feature space. Thus, even with representations robust to class-specific transformations, statistically driven models trained on them would not be able to generalize when the labeled data is scarce. Inspired by the stochasticity of synaptic connections in the brain, we introduce Energy-based Stochastic Ensembles. These ensembles can learn non-deterministic representations, i.e., mappings from the feature space to a family of distributions in the latent space. These mappings are encoded in a distribution over a (possibly infinite) collection of models. By conditionally sampling models from the ensemble, we obtain multiple representations for every input example and effectively augment the data. We propose an algorithm similar to contrastive divergence for training restricted Boltzmann stochastic ensembles. Finally, we demonstrate the concept of stochastic representations on a synthetic dataset and test them in the one-shot learning scenario on MNIST.
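
    A toy sketch of the central idea, under the simplifying assumption that sampling models from the ensemble can be imitated by drawing Gaussian perturbations of a base encoder's weights: each input is mapped to several stochastic latent codes rather than to a single deterministic one. The encoder form and perturbation scale are illustrative, not the paper's construction.

        import numpy as np

        rng = np.random.default_rng(3)
        n_in, n_lat, n_models = 784, 64, 5
        W_base = rng.normal(0.0, 0.05, size=(n_in, n_lat))

        def sigmoid(a):
            return 1.0 / (1.0 + np.exp(-a))

        x = rng.random((1, n_in))                           # a single input example

        # Sampling models yields multiple representations of the same input,
        # which effectively augments scarce labeled data.
        reps = [sigmoid(x @ (W_base + rng.normal(0.0, 0.01, size=W_base.shape)))
                for _ in range(n_models)]
        reps = np.concatenate(reps, axis=0)                 # shape: (n_models, n_lat)
        print(reps.shape)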